In this paper, we will develop a systematic approach to deriving guaranteedbounds for approximate dynamic programming (ADP) schemes in optimal controlproblems. Our approach is inspired by our recent results on bounding theperformance of greedy strategies in optimization of string-submodular functionsover a finite horizon. The approach is to derive a string-submodularoptimization problem, for which the optimal strategy is the optimal controlsolution and the greedy strategy is the ADP solution. Using this approach, weshow that any ADP solution achieves a performance that is at least a factor of$\beta$ of the performance of the optimal control solution, which satisfiesBellman's optimality principle. The factor $\beta$ depends on the specific ADPscheme, as we will explicitly characterize. To illustrate the applicability ofour bounding technique, we present examples of ADP schemes, including thepopular rollout method.
展开▼